Phylogeny of the Megascolecidae and 
Crassiclitellata (Annelida, Oligochaeta): 
combined versus partitioned analysis using 
nuclear (28S) and mitochondrial (12S, 16S) rDNA 


Barrie G. M. JAMIESON 

Department of Zoology and Entomology, University of Queensland, 

Brisbane 4072, Queensland (Australia) 
bjamieson@zen.uq.edu.au 

Simon TILLIER 
Annie TILLIER 

Département Systématique et Évolution et Service de Systématique moléculaire,

Muséum national d’Histoire naturelle,
43 rue Cuvier, F-75231 Paris cedex 05 (France)

Jean-Lou JUSTINE

Laboratoire de Biologie parasitaire, Protistologie, Helminthologie,
Département Systématique et Évolution,
Muséum national d’Histoire naturelle,
61 rue Buffon, F-75321 Paris cedex 05 (France)

present address:

UR “Connaissance des Faunes et Flores Marines Tropicales”,

Centre IRD de Nouméa, B.P. A5,
98848 Nouméa cedex (Nouvelle-Calédonie)

Edmund LING 

Department of Zoology and Entomology, University of Queensland, 

Brisbane 4072, Queensland (Australia) 

Sam JAMES 

Department of Life Sciences, Maharishi University of Management, 

Fairfield, Iowa 52557 (USA) 

Keith MCDONALD 

Queensland Parks and Wildlife Service, 
PO Box 834, Atherton 4883, Queensland (Australia) 

Andrew F. HUGALL 

Department of Zoology and Entomology, University of Queensland, 

Brisbane 4072, Queensland (Australia) 


ZOOSYSTEMA • 2002 • 24 (4) © Publications Scientifiques du Museum national d’Histoire naturelle, Paris, www.zoosystema.com 




Jamieson B. G. M., Tillier S., Tillier A., Justine J.-L., Ling E., James S., McDonald K. & 
Hugall A. F. 2002. — Phylogeny of the Megascolecidae and Crassiclitellata (Annelida, 
Oligochaeta): combined versus partitioned analysis using nuclear (28S) and mitochondrial 
(12S, 16S) rDNA. Zoosystema 24 (4): 707-734. 


KEYWORDS 

Annelida, 
Oligochaeta, 
Clitellata, 
Crassiclitellata, 
Megascolecidae, 
molecular systematics, 
maximum likelihood, 
consensus, 
parametric tests. 


ABSTRACT 

Analysis of megascolecoid oligochaete (earthworms and allies) nuclear 28S rDNA and mitochondrial 12S and 16S rDNA using parsimony and likelihood, partition support and likelihood ratio tests indicates that all higher, suprageneric, classifications within the Megascolecidae are incompatible with the molecular data. The two data-sets (nuclear and mitochondrial) may have historical or methodological incompatibilities; we therefore explore the effect on measures of support and conflict at three levels: 1) separate analysis; 2) combining the data with a single model; and 3) combining the relative support for competing topologies using separate models. Resolving power is identified via partition support, consensus and four competing likelihood ratio tests. Combined analysis identifies hidden support and conflict; more complex models reduce this conflict, possibly owing to removal of dynamic heterogeneity, and give a more resolved consensus. This is incompatible with morphological classifications, rejection of which varies among likelihood ratio tests. Both congruence and combined power support our conclusions: most of the groupings are based on homoplasies, for instance, multiple origin of racemose prostates or of “dichogastrin” meronephridia. The widely used classification of the non-ocnerodrilin Megascolecidae into three groups (Acanthodrilidae, with tubular prostates and holonephridia; Octochaetidae, with tubular prostates and meronephridia; and Megascolecidae, with racemose prostates) is not supported by molecular data. Monophyly of the Crassiclitellata Jamieson, 1988, oligochaetes with a multilayered clitellum, is confirmed. The results provide support for including the branchiobdellids and leeches in the Oligochaeta.


RÉSUMÉ

Phylogénie des Megascolecidae et Crassiclitellata (Annelida, Oligochaeta) : analyse combinée contre analyse partitionnée, utilisant l’ADNr nucléaire (28S) et mitochondrial (12S, 16S).

Chez les oligochètes mégascolécoïdes, une analyse de l’ADNr nucléaire (28S) et mitochondrial (12S et 16S), utilisant la parcimonie et la vraisemblance, le soutien des partitions et les tests de taux de vraisemblance, indique que toutes les classifications supragénériques des Megascolecidae sont incompatibles avec les données moléculaires. Les deux jeux de données, nucléaire et mitochondrial, peuvent présenter des incompatibilités historiques ou méthodologiques et nous explorons donc l’effet du soutien et du conflit sur les mesures, à trois niveaux : 1) analyses séparées ; 2) combinaison des données avec un seul modèle ; et 3) combinaison des soutiens pour des topologies en compétition en utilisant des modèles séparés. La puissance de résolution est identifiée par le soutien des partitions, le consensus, et quatre tests de taux de vraisemblance en compétition. L’analyse combinée identifie le soutien caché et le conflit ; des modèles plus complexes réduisent ce conflit, probablement parce qu’ils suppriment une hétérogénéité dynamique et produisent un consensus mieux résolu. Ceci est incompatible avec les classifications morphologiques, qui sont plus ou moins rejetées en fonction du test de taux de vraisemblance utilisé. La congruence et le soutien de pouvoir combiné soutiennent nos conclusions : la plupart des groupements ont été basés sur des homoplasies, par exemple l’origine multiple des prostates racémeuses et de la méronéphridie de « type Dichogastrinae ». La classification largement utilisée des Megascolecidae non-Ocnerodrilinae en trois groupes (Acanthodrilidae, avec prostate tubulaire et holonéphridie ; Octochaetidae, avec prostate tubulaire et méronéphridie ; Megascolecidae, avec prostate racémeuse) n’est pas soutenue par les données moléculaires. La monophylie des Crassiclitellata Jamieson, 1988, oligochètes avec un clitellum à plusieurs couches, est confirmée. Les résultats fournissent des arguments pour l’inclusion des branchiobdellides et des hirudinées dans les oligochètes.


INTRODUCTION 

General 

The Clitellata Michaelsen, 1919 are annelids which include the oligochaetes (earthworms and their allies), branchiobdellids (ectoparasites of freshwater crayfish) and leeches. They are defined by the possession of a modification of the epidermis, the clitellum, which is located at least partly behind the female pores and which secretes a cocoon in which the eggs are laid. They were renamed Euclitellata by Jamieson (1983) because a clitellum also occurs in questid polychaetes, though there it lies anterior to the female pores (Jamieson 1983). Authorships of taxa indicated in Table 1 will not be repeated in the text except where required for clarity.

Molecular studies have confirmed monophyly of the Clitellata, using 18S rRNA (Winnepenninckx et al. 1998; Martin et al. 2000; Martin 2001; Rota et al. 2001), cytochrome oxidase I (COI) (Siddall & Burreson 1998; Nylander et al. 1999) or elongation factor 1-alpha (Kojima 1998); see also a review by McHugh (2000). An apparent exception to clitellate monophyly in a parsimony analysis of 18S rRNA was rejected as being due to spurious attraction between two apparently polychaete sequences and the branchiobdellids (Martin 2001). With regard to the position of the Clitellata within the Annelida, molecular analysis has indicated that (eu)clitellates form a clade within the Polychaeta and that polychaetes are a paraphyletic or polyphyletic group (Kojima 1998; McHugh 2000; Martin 2001; Rota et al. 2001). Relationships of oligochaetes, branchiobdellids and leeches within the Clitellata have been harder to resolve.

Paraphyly of the Oligochaeta, with leeches and/or branchiobdellids lying within the oligochaete clade, has long been suspected on morphological grounds (Michaelsen 1928-1932; Brinkhurst & Nemec 1986; Brinkhurst & Gelder 1989; Jamieson et al. 1987; Jamieson 1988; Purschke et al. 1993) and is being increasingly confirmed by molecular analyses. Siddall & Burreson (1998) and Siddall et al. (2001), using a combination of nuclear 18S and mitochondrial COI sequences, found support for the argument that leeches and branchiobdellids, with the leech-like fish parasite Acanthobdella Grube, 1851, form a monophylum within the Oligochaeta close to the aquatic oligochaete family Lumbriculidae. Derivation of leeches from a lumbriculid-like ancestor had been suggested on morphological grounds by Michaelsen (1928-1932) and Brinkhurst & Nemec (1986). Martin (2001), using complete 18S rRNA gene sequences, and taking secondary structure into account, also included Euhirudinea Lukin, 1956 (true leeches) and Acanthobdellida Livanow, 1905 in the Oligochaeta. He suggested the Branchiobdellida Holt, 1963 via the Lumbriculidae as a possible link between the two assemblages; the exact position of Hirudinea and Branchiobdellida within oligochaetes remained unresolved. An extremely ancient radiation of polychaetes and emergence of (eu)clitellates was proposed. Rota et al. (2001), for the same gene, with and without consideration of secondary structure, but omitting lumbriculids, confirmed the sister-group relationship between Acanthobdella and Hirudinea but found the branchiobdellids to be a more distant clade, paired with two polychaetes; the many polychaetes included appeared highly polyphyletic.

Crassiclitellate and megascolecid hypotheses

Oligochaetes s.s. are marine, freshwater and terrestrial. Unlike leeches they do not include parasitic species, though two ectocommensal species on earthworms are known. With the exception of some earthworm-like genera, aquatic oligochaetes are usually small and are loosely termed “microdriles”. They are characterized by a plesiomorphic type of clitellum in which, like the epidermis from which it is derived, there is only a single layer of cells. Its simple structure and limited ability to secrete nutrients into the cocoon correlate with the production of small numbers of large, yolky eggs. A major evolutionary innovation in earthworms (loosely termed “megadriles”), in contrast, has been the development of a clitellum consisting of several layers of cells with the ability to secrete large proteinaceous reserves into the cocoon. Correlated with this, the eggs possess little yolk, are therefore small and are produced in large numbers (Jamieson 1992). In a morphocladistic analysis (Jamieson 1988), all families with multilayered clitella were found to form a single clade, named the Crassiclitellata Jamieson, 1988. Thus acquisition of a multilayered clitellum was deduced to be a monophyletic event. However, Omodeo (2000) implied that a multilayered clitellum has arisen more than once when he derived the Eudrilidae (with multilayered clitellum) from the Alluroididae Michaelsen, 1900 (with clitellum consisting of a single layer of cells) independently of other earthworms, a familial relationship not supported morphocladistically. In the present study we test the monophyly of the Crassiclitellata using molecular data.

It is not the aim of this paper to make a detailed examination of higher-level relationships within the Crassiclitellata. However, some analysis is made of the division of the Crassiclitellata on morphocladistic evidence (Jamieson 1988) into two groups, the Aquamegadrili Jamieson, 1988 and Terrimegadrili Jamieson, 1988. Aquamegadrili have an aquatic or semi-aquatic mode of life, and consisted of the families Sparganophilidae (Holarctic), Biwadrilidae Jamieson, 1971 (Japan), Almidae (mostly warm tropics but including Criodrilus Hoffmeister, 1845, in the Mediterranean region, etc.) and Lutodrilidae (southern Nearctic). It is not unlikely that the aquamegadrile families, irrespective of mono- or polyphyly of the group, have always had an aquatic or amphibious existence. The remainder of the Crassiclitellata were predominantly terrestrial, hence the term Terrimegadrili. These consisted of the superfamilies Ocnerodriloidea Beddard, 1891, Eudriloidea Claus, 1880, Lumbricoidea Claus, 1876, and Megascolecoidea Rosa, 1891. The validity of recognizing the Aquamegadrili and Terrimegadrili is tested here from molecular data.

The Lumbricoidea as redefined by Jamieson (1978, 1988) included the Lumbricidae (Holarctic), Komarekionidae (Nearctic), Glossoscolecidae (Neotropical), Microchaetidae (Ethiopian, south of the Kalahari), Hormogastridae (western Palaearctic, Tyrrhenian), and Ailoscolecidae Bouche, 1969 (Palaearctic). Whether the Kynotidae Jamieson, 1971 (Malagasy) should be assigned to the Aquamegadrili or Terrimegadrili was uncertain. Some further families have been added more recently (see Omodeo 2000). The families Megascolecidae, Ocnerodrilidae and Eudrilidae had been tentatively included in the superfamily Megascolecoidea by Jamieson (1978, 1980). However, in the cladistic analysis (Jamieson 1988), the Eudrilidae (superfamily Eudriloidea) and especially the Ocnerodrilidae (superfamily Ocnerodriloidea) occupied a basal position relative to the other terrimegadrile families. These classifications are evaluated below and, to anticipate, are all called into question.

The largest, most speciose, earthworm family is the Megascolecidae, for which a Pangean origin has been suggested (Jamieson 1981). They are native in the Nearctic, Ethiopian, Oriental, Australian, eastern Palaearctic (China, Japan, Korea) and southern Neotropical regions, with Central America. In South America north of the Juramento-Salado River, megascolecids are replaced by the large family Glossoscolecidae. In the Ethiopian region, particularly in tropical West and East Africa, the family Eudrilidae greatly outnumbers the Megascolecidae in genera. Currently recognized subfamilies of the Megascolecidae are the Acanthodrilinae Vejdovsky, 1884 and Megascolecinae Rosa, 1891, with or without the Ocnerodrilinae Beddard, 1891. Indigenous acanthodriles are predominant in the earthworm faunas of the southern and eastern portions of North America, Mexico, Guatemala, southern South America, South Africa, New Zealand, New Caledonia, and parts of Australia. The native range of the Ocnerodrilidae includes the warmer parts of North and South America, the Dominican Republic, Africa, India and Burma (Jamieson 1981).

Several, sometimes widely divergent, classifications of these megascolecoid earthworms have been proposed since publication of Stephenson’s monograph of the Oligochaeta (1930) (see Michaelsen 1933; Pickford 1937; Omodeo 1958; Lee 1959; Gates 1959; Sims 1966, 1967; Jamieson 1971a-c, 1978, 1988). Particular attention will be paid in the present study to the system of Gates (1959), supported by Sims (1966, 1967), and that of Jamieson (1971a-c), as both of these systems are widely used. Detailed discussion of these alternative classifications may be found in the 1971 papers.

The procedures for analysing the molecular data 
for representatives of the above taxa will now be 
considered. 

Combining data, likelihood models and hypothesis testing

Higher-level systematics is turning to congruence and combined power in multiple data-set analyses as the most convincing source of phylogenetic signal. This raises issues of how to combine the data and how to identify possible conflict. Krajewski et al. (1999) summed up the issues involved with integrating different data in terms of 1) historical heterogeneity (the data have different histories) versus 2) dynamic heterogeneity (methods for extracting the phylogenetic signal are incompatible). Because they come from independent genomes, the nuclear 28S rDNA and the mtDNA data-sets are biologically independent sources of information (i.e. they could have historically different genealogies), so congruence between them can be used as evidence. They may also have very different apparent sequence evolution patterns, being “dynamically” heterogeneous. To date most analyses have combined data (e.g., Flook et al. 1999; Crandall et al. 2000). However, lumping heterogeneous data produces a patently artifactual model, compromising parameter estimation and interpretation of levels of support (Goldman 1993; Yang 1996).

Real conflict (e.g., introgression, ancestral polymorphism, paralogous loci) may be overlooked by tests of total character support (such as likelihood ratio tests). On the other hand, lumping individual differences in a consensus may overlook underlying similarities. Partition support analysis (Baker & DeSalle 1997) allows such hidden support and conflict to be identified. Hence, for likelihood analyses, in addition to combined data methods we have chosen to combine the data by adding the likelihoods of each partition optimised individually (Edwards 1972; Adachi & Hasegawa 1992; Huelsenbeck & Bull 1996; Yang 1996; Wilgenbusch & De Queiroz 2000). We then compare individual contributions with partition support, extended to likelihood (Lee & Hugall in press). Summing the likelihoods of hypotheses (the “support surface”) from different data-sets is the general likelihood framework for combining disparate sources of information, which in the case of phylogenetics could include non-sequence data, given appropriate models. Here we apply this approach to a phylogenetic study of nuclear and mitochondrial DNA sequences in clitellates, allowing a revision of the higher classification of the earthworm family Megascolecidae and permitting examination of the validity of the Crassiclitellata and the relationships of these within the Clitellata.
Table 1. Taxa sequenced for this study. Classification according to Jamieson (1971a-c, 1988). *, Lumbricus terrestris Linnaeus, 1758 GenBank accession; †, for Figure 4 analysis, 16S is from Digaster lingi. Specimens collected by authors except where indicated.


Taxon 

Location 

28S 

12S 

16S 

OUTGROUP 

Hirudinea Lamarck, 1818 

Haemadipsidae Blanchard, 1893 
Haemadipsidae Gen. sp. (terrestrial) 
Haemadipsidae Gen. sp. (freshwater) 

Mt Glorious, Queensland 

North Stradbroke Island, Queensland 

AY101557 

AY101558 



Branchiobdellidae Odier, 1823 
Cambarincola pamelae Holt, 1984 
Xironodrilus formosus Ellis, 1919 

USA (R. O. Brinkhurst) 

USA (R. O. Brinkhurst) 

AF406601 

AF406600 



Oligochaeta Grube, 1850 

Lumbriculidae Vejdovsky, 1884 
Lamprodrilus Michaelsen, 1901 sp. 
Lumbriculus variegatus (Muller, 1774) 
Rhynchelmis brachycephala Michaelsen, 
1891 

Tenagodrilus musculus Eckroth 
& Brinkhurst, 1996 

Enchytraeidae Vejdovsky, 1879 

Achaeta bohemica Vejdovsky, 1879 
Enchytraeus albidus Henle, 1837 
Fridericia bisetosa (Levinsen, 1884) 

Lake Baikal (P. Martin) 

Cultured, Brisbane 

Lake Baikal (P. Martin) 

USA (R. O. Brinkhurst) 

Arezzo, Italy (E. Rota) 

Brisbane, Queensland 

Arezzo, Italy (E. Rota) 

AF406592 

AF406594 

AF406593 

AF406591 

AF406595 

AF406597 

AF406596 



Tubificidae Vejdovsky, 1884 

Branchiura sowerbyi Beddard, 1892 

Cultured, Australia 

AY101559 



Naididae Ehrenberg, 1831 

Dero Oken, 1815 (Aulophorus 

Schmarda, 1861) sp. 

Haplotaxidae Michaelsen, 1900 
Haplotaxis Hoffmeister, 1843 sp. 
Haplotaxis gordioides (Hartmann, 1821) 

Cultured, Queensland 

Logan River USA (R. O. Brinkhurst) 
Trezzotinella, Italy (E. Rota) 

AF406598 

AF406599 

AF406602 



Sparganophilidae Michaelsen, 1918 
Sparganophilus tamesis Benham, 1892 

Missouri, USA 

AY101566 



Lutodrilidae McMahan, 1976 

Lutodrilus multivesiculatus McMahan, 
1976 

Louisiana, USA 

AY101567 



Almidae Duboscq, 1902 

Criodrilus lacuum Hoffmeister, 1845 

Algeria (P. Omodeo) 

AY048492 

AY101545 


Ocnerodrilidae Beddard, 1891 

Eukerria saltensis (Beddard, 1895) 

Cultured, Brisbane, Queensland 

AY048496 

AY101546 

AF406590 

Eudrilidae Claus, 1880 

Eudrilus eugeniae (Kinberg, 1867) 

Cultured, Brisbane, Queensland 

AY101568 

AY048471 


Komarekionidae Gates, 1974 
Komarekiona eatoni Gates, 1974 

Kentucky, USA 

AY101569 



Microchaetidae Michaelsen, 1900 
Microchaetidae Gen. sp. 

Grahamstown, South Africa 
(A. Hodgson) 

AY101570 



Glossoscolecidae Michaelsen, 1900 
Glossoscolecidae Gen. sp. 

Pontoscolex corethrurus (Muller, 1856) 

Prov. Pichincha, Ecuador 

North Queensland 

AY048507 

AY101571 



Lumbricidae Claus, 1876 

Eisenia fetida (Savigny, 1826) 
Lumbricidae Gen. sp. 

Paris, France 

Samford, Queensland 

AY048508 

AY048498 

AY048472 

U24570* 

Hormogastridae Michaelsen, 1900 
(As Hormogastrinae) 

Hormogaster redii Rosa, 1887 

Molara Is., Sardinia, Italy (P. Omodeo) 

AY048506 
Taxon Location 28S 12S 16S

INGROUP 

Oligochaeta Grube, 1850 


Megascolecidae Rosa, 1891 
Acanthodrilinae Vejdovsky, 1884 
Diplocardia longiseta Murchie, 1965 
Diplotrema acropetra Jamieson, 1997 

Diplotrema Spencer, 1900 sp. 

Neodiplotrema altanmoui Jamieson, 1997 
Rhododrilus glandifera Jamieson, 1995 

Megascolecinae Rosa, 1891 
Perionychini Jamieson, 1971 
cf. Argilophilus Eisen, 1893 sp. 
Diporochaeta Beddard, 1890 sp. 
Fletcherodrilus unicus (Fletcher, 1889) 
Fletcherodrilus unicus (Fletcher, 1889) 
Fletcherodrilus fasciatus (Fletcher, 1890) 
Fletcherodrilus sigillatus 
(Michaelsen, 1916) 

Heteroporodrilus Jamieson, 1970 sp. 
Diporochaeta cf. kershawi 
(Jamieson, 1974) 

Perionyx excavatus Perrier, 1872 
Pontodrilus litoralis (Grube, 1855) 

Terrisswaikerius athertonensis 
(Michaelsen, 1916) 

Terrisswaikerius grandis (Spencer, 1900) 
Terrisswaikerius kuranda (Jamieson, 1976) 
Terrisswaikerius millaamillaa (Jamieson, 
1976) 

Terrisswaikerius phalacrus (Michaelsen, 
1916 ) 

Terrisswaikerius phalacrus (Michaelsen, 
1916) 

Terrisswaikerius windsori Jamieson, 1995 
Dichogastrini Jamieson, 1971 
Dichogaster Beddard, 1888 sp. 
(sexprostatic) 

Dichogaster saliens (Beddard, 1893) 
Didymogaster sylvaticus Fletcher, 1886 
Digaster anomala Jamieson, 1970 
Digaster lingi Jamieson, 1995 
Megascolecini Jamieson, 1971 
Amynthas rodericensis (Grube, 1879) 
Begemius queenslandicus Easton, 1982 

Propheretima hugalli Jamieson, 1995 
Spenceriella cormieri Jamieson 
& Wampler, 1979 
Spenceriella Michaelsen, 1907 sp. 


Number of sequences 


Iowa, USA 

Rocky Peak, Starke Station, 
Queensland 

Souita Falls, Atherton Tableland, 
Queensland 

Altanmoui Range, Queensland 
Wooroonooran National Park, 
Queensland 


Oregon, USA 

Near Cradle Mt, Tasmania 

Brisbane, Queensland 

Broken River, Eungella, Queensland 

Lamington National Park, Queensland 

Pelling’s, Atherton Tableland, 

Queensland 

Brisbane, Queensland 

Tasmania (R. Blakemore) 

Cultured, Brisbane, Queensland 
Bush Bay, Western Australia 
(M. Harvey) 

Carbine Tableland, Queensland 

Lamb Range, Queensland 
Ebony Creek Rd, Queensland 
Souita Falls, Atherton Tableland, 
Queensland 

Souita Falls, Queensland 

Kennedy Falls, Queensland 

Windsor Tableland, Queensland 

Plateau Boucher, Martinique 

Samford, Queensland 
Hornsby, New South Wales 
Brisbane, Queensland 
Binna Burra, Queensland 

Brisbane, Queensland 
Mt Lewis, Carbine Tableland, 
Queensland 

Boat Harbour, Lismore, Queensland 
O’Reilly’s, Border Ranges. 
Queensland 
Boat Harbour, Lismore, 

New South Wales 


AY101572 


AY101573 

AY048469 

AF406568 

AY048477 

AY048466 

AF406570 

AY101574 

AY048468 

AF406569 


AY048467 


AY101575 



AY048479 

AY048460 

AF406574 

AY048474 

AY101547 


AY101565 

AY048423 

AF406558 

AY048503 

AY101548 


AY048473 

AY048425 

AF406588 

AY048497 

AY101553 

AF406579 

AY048484 

AY048461 

AF406567 

AY048499 

AY048456 

AF406582 

AY101576 

AY048463 

AF406586 

AY048504 

AY048453 

AF406585 

AY048495 

AY101551 

AF406566 

AY048478 

AY048432 

AF406581 

AY048476 

AY048452 

AF406565 

AY048490 

AY048450 

AF406577 

AY101556 

AY048449 


AY048494 

AY101552 

AF406587 

AY101555 

AY101549 

AF406571 

AY048493 

AY048470 

AF406573 

AY048491 

AY101554 

AF406575 

AY048480 

AY048462 

†

AY101561 

AY048459 

AF406583 

AY101562 



AY101563 

AY048465 

AF406578 

AY048505 

AY101550 


AY101564 

AY048458 

AF406589 

AY048475 

AY048454 

AF406572 

59 

34 

28 
Table 2. Primers employed. Position numbers are from the Lumbricus terrestris mtDNA (accession number U24570).

Primer   3' position   Sequence 5' to 3'            Gene                Author
12SE1    10538         AAAACATGGATTAGATACCCRYCTAT   12S rRNA L          present work
12SH     10919         ACCTACTTTGTTACGACTTATCT      12S rRNA H          present work
16SAR    11639         CGCCTGTTTATCAAAAACAT         16S rRNA L          Palumbi (1996)
16SBR    12120         CCGGTCTGAACTCAGATCACGT       16S rRNA H          Palumbi (1996)
C1'      -             ACCCGCTGAATTTAAGCAT          28S rRNA “coding”   Tillier
D2       -             TCCGTGTTTCAAGACGG            28S rRNA            Tillier
C2       -             TGAACTCTCTCTTCAAAGTTCTTTTC   28S rRNA            Tillier
C2'      -             GAAAAGAACTTTGRARAGAGAGT      28S rRNA            Tillier


MATERIALS AND METHODS 

The list of the species used in this analysis is presented in Table 1. The list follows the classification of Jamieson (1971a-c, 1988). Voucher specimens are lodged in the Queensland Museum, Brisbane, Australia.

Molecular data 

For one fresh specimen of Fletcherodrilus sigillatus, mtDNA was purified by differential centrifugation and CsCl density-gradient ultracentrifugation. All other specimens were preserved in ethanol. For these, DNA was purified by SDS detergent-based lysis in the presence of proteinase K followed by extraction with phenol/chloroform and ethanol precipitation, or by digesting tissue at 55°C in 5 % chelex water and 5 ml of proteinase K (10 mg/ml). Some of the DNA extractions were performed using CTAB (cetyltrimethylammonium bromide) (Winnepenninckx et al. 1993).

The primers used are indicated in Table 2. For 28S genes, universal primers located in the C1 domain and just 3' to the end of the D2 domain, designed by A. Tillier, amplified a c. 770 bp fragment covering most of the C1, and all of the D1, C2, and D2 domains. A 450 bp fragment of the mtDNA 12S was amplified and sequenced using modified versions of the standard 12S primers (Palumbi 1996) (developed using Lumbricus terrestris, GenBank accession number U24570). The primer 12SE1 was designed around the traditional 12S1 site, while the primer 12SH was placed 31 bp 3' to the 12S2 site (see Table 2). MtDNA 16S was amplified and sequenced using standard 16S rDNA primers 16SAR and 16SBR (Palumbi 1996). Nuclear 28S PCR (polymerase chain reaction) included 3.5 % DMSO. Most PCR products were gel purified (some were purified directly by addition of exonuclease and shrimp alkaline phosphatase) and sequenced using ABI dye terminator automated sequencing for, in most cases, both strands. Some were manually sequenced. The taxa analysed in the study are drawn from a larger set (unpublished) which includes multiple representatives of eight of the OTUs (operational taxonomic units) presented here: all these within-species gene comparisons are congruent.

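
Two of the primers in Table 2 carry IUPAC degenerate codes (R for A/G, Y for C/T), and C2 and C2' appear to lie on opposite strands over the same site. The following is a minimal Python sketch, not part of the original protocol, that expands a degenerate primer into its concrete sequences and checks that C2' is compatible with the reverse complement of C2. The function names are illustrative assumptions; the sequences are those listed in Table 2.

```python
from itertools import product

# IUPAC codes needed for the Table 2 primers: the four bases plus R and Y.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
}

# Complement of each code; R (A/G) pairs with Y (T/C) and vice versa.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R"}


def expand(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


def reverse_complement(primer):
    """Reverse-complement a primer, keeping degenerate codes degenerate."""
    return "".join(COMPLEMENT[b] for b in reversed(primer))


def compatible(a, b):
    """True if two equal-length primers share a possible base at every position."""
    return len(a) == len(b) and all(
        set(IUPAC[x]) & set(IUPAC[y]) for x, y in zip(a, b)
    )


PRIMER_12SE1 = "AAAACATGGATTAGATACCCRYCTAT"
PRIMER_C2 = "TGAACTCTCTCTTCAAAGTTCTTTTC"
PRIMER_C2_PRIME = "GAAAAGAACTTTGRARAGAGAGT"

# 12SE1 has one R and one Y, so it stands for 2 x 2 concrete sequences.
variants_12se1 = expand(PRIMER_12SE1)

# Compare C2' with the matching stretch of the reverse complement of C2.
rc_c2 = reverse_complement(PRIMER_C2)
c2_pair_matches = compatible(PRIMER_C2_PRIME, rc_c2[: len(PRIMER_C2_PRIME)])
```

With the Table 2 sequences, `variants_12se1` holds four sequences and `c2_pair_matches` is true, which is consistent with C2 and C2' being a forward and reverse pair at one site.
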
Nucleotide sequences were first aligned using CLUSTALW (Thompson et al. 1994). These were then modified according to accepted secondary structure models for 12S (Hickson et al. 1996) and 16S (see De Rijk et al. 1999). Across the 28S rDNA sequence there were several regions of ambiguous alignment within the D1 and D2 variable domains. In the Crassiclitellata and Megascolecidae 28S data-set, three regions were removed: one in the D1 domain (10 bp between the B13 and B13_1 stems) and two in the D2 domain (10 and 12 bp), leaving 622 sites. The higher level oligochaete 28S data-set was more difficult to align and more sections of the D2 expansion region were removed, resulting in 549 sites (alignment available from authors). For 12S rDNA, four loop regions of variable length between stem regions 40'-39', 42-42', 47-47' and 48-48', totalling 30-50 bp, were removed, leaving 315 sites. Using the nomenclature of De Rijk et al. (1999), small sections of the E25 and G3 loops were removed from the 16S rDNA alignment, totalling 10-20 bp, leaving 435 sites.
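
Removing ambiguously aligned regions amounts to deleting the same column ranges from every aligned sequence. A minimal Python sketch of that step follows; the taxon names, toy sequences and column coordinates are invented for illustration and do not correspond to the real 28S, 12S or 16S alignments.

```python
def mask_columns(alignment, excluded):
    """Drop excluded column ranges (0-based, half-open) from every sequence.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    excluded:  list of (start, end) column ranges to remove.
    """
    drop = set()
    for start, end in excluded:
        drop.update(range(start, end))
    return {
        name: "".join(base for i, base in enumerate(seq) if i not in drop)
        for name, seq in alignment.items()
    }


# Toy alignment of 14 columns; columns 4-5 and 9 stand in for loop regions
# that cannot be aligned with confidence.
toy_alignment = {
    "taxon_A": "ACGT--ACGTTTGA",
    "taxon_B": "ACGTAAACG-TTGA",
    "taxon_C": "ACGA-AACGTTTGA",
}
trimmed = mask_columns(toy_alignment, [(4, 6), (9, 10)])
sites_left = len(trimmed["taxon_A"])
```

Here three of the 14 columns are excluded, so `sites_left` is 11, in the same way that the exclusions described above leave 622, 315 and 435 sites for the three genes.
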
Fig. 1. Average and range of base content (horizontal axis: A, C, G, T) of variable sites across all available taxa (59 for 28S, 25 for mtDNA genes). Left pair for each base: the 28S rDNA for Megascolecidae and outgroup taxa. Right pair for each base: megascolecid 12S rDNA and 16S rDNA.


Phylogenetic analyses 
PAUP* 4.0b3a (Swofford 2000) was used for neighbour joining, parsimony and likelihood analyses, bootstrapping (BS), tree topology distances, base content, partition homogeneity tests, Bremer values, and branch length and pairwise divergence estimates. PHYLIP 3.5 (Felsenstein 1995) was used to generate certain bootstrap resamplings and trees from base-content distances. Random and various constraint trees were generated in MacClade 3.07 (Maddison & Maddison 1995). Parametric data-sets were generated with 100 (sometimes 200) replicates using SeqGen 1.1 (Rambaut & Grassly 1997) with model parameters drawn from maximum likelihood (ML) estimates. Modeltest 3.0 of Posada & Crandall (1998) provided information on likelihood model choice. Maximum parsimony (MP) analyses used heuristic search with tree bisection reconnection (TBR) and stepwise random sequence addition. ML model parameters and trees were optimised by a heuristic procedure of successive searches and re-optimisations using SPR (subtree pruning and regrafting) (Swofford in PAUP* release notes; Huelsenbeck 1998). Throughout we refer to relative log-likelihood (ΔlnL) as the difference in -lnL between topologies, typically a ML tree versus others.
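
For readers unfamiliar with bootstrapping, the resampling step can be sketched in a few lines of Python: alignment columns are drawn with replacement until a pseudo-replicate of the original length is obtained. This is a generic illustration under assumed toy data, not the PAUP* or PHYLIP implementation used here.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate).

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    rng:       a random.Random instance, passed in so runs are repeatable.
    """
    names = list(alignment)
    n_sites = len(alignment[names[0]])
    # The same sampled column indices are applied to every taxon.
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        name: "".join(alignment[name][c] for c in columns) for name in names
    }


# Invented three-taxon alignment, for illustration only.
toy_alignment = {
    "taxon_A": "ACGTACGTTGA",
    "taxon_B": "ACGTACGATGA",
    "taxon_C": "ACGAACGTTGC",
}
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
```

Each replicate would then be analysed in the same way as the original data, and the proportion of replicates recovering a clade reported as its bootstrap value.
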


Assessing congruence between mitochondrial and 
nuclear genes 

Apparent conflict between the nuclear and mitochondrial genes was explored with the character based partition homogeneity test (ILD [incongruence length difference]; Farris et al. 1995; for excluding uninformative sites, see Lee 2001), likelihood ratio tests, the parametric test of Huelsenbeck & Bull (1996), and partition support using both MP and ML (Baker & DeSalle 1997; Lee & Hugall in press). Partition support values, both Bremer and relative lnL, were calculated in PAUP* using reverse constraint trees (best tree not containing a specified clade). Likelihood is partitioned using site -lnL values.
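
As a reminder of what the ILD test measures, the statistic is the length of the most parsimonious tree for the combined data minus the summed lengths of the most parsimonious trees for each partition, compared against the same quantity for random partitions of equal size. The Python sketch below assumes the tree lengths have already been obtained from parsimony searches; all numbers are hypothetical, and the "+1" in the P value is one common convention rather than necessarily the one applied in PAUP*.

```python
def ild(length_combined, partition_lengths):
    """Incongruence length difference: extra steps forced by combining data."""
    return length_combined - sum(partition_lengths)


def ild_p_value(observed, null_values):
    """Share of random partitions at least as incongruent as the observed one.

    The observed partition is counted as one member of the null set.
    """
    as_extreme = sum(1 for d in null_values if d >= observed)
    return (as_extreme + 1) / (len(null_values) + 1)


# Hypothetical tree lengths (steps) for combined, nuclear and mtDNA data.
observed_ild = ild(1500, [620, 860])

# Hypothetical ILD values from ten random partitions of the same sizes.
null_ild = [3, 5, 8, 12, 20, 25, 4, 6, 9, 7]
p_value = ild_p_value(observed_ild, null_ild)
```

In this toy case the observed ILD is 20 steps and two of the ten random partitions reach or exceed it, giving P = 3/11; real analyses would use many more replicates.
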

Combining mtDNA and nuclear gene sequence 
data 

Complex models in likelihood procedures are designed to accommodate heterogeneity among sites. Although variation among the mtDNA genes concerned here is unlikely to be much greater than that within a gene, the differences between the mtDNA and nuclear genes are much more striking. As the sequence statistical analysis indicates that the mtDNA and nuclear genes are substantially different in their basic characteristics (Fig. 1; Table 3), we approach the issue of dynamic heterogeneity by optimising models and parameters for each data partition but inferring a phylogeny from the total score. Within the mtDNA data, as the 12S and 16S sequences are sufficiently similar in character, neither dynamic nor historical heterogeneity is an issue (notwithstanding cryptic pseudogenes): they have been pooled, in favour of one model, in preference to splitting up the number of variable sites, which might result in over-specified models for the amount of data.

While combining data in MP is straightforward - all parts have the same
model - for likelihood analyses we have chosen to combine the data in
two ways: the first, called the COMBO model, optimises the model to the
combined data; the second, called the SUM model, adds the likelihoods of
each partition optimised individually (Edwards 1972; Wilgenbusch & De
Queiroz 2000). To provide a tree search space for the SUM method we
evaluated 14586 trees drawn from 1000 near-parsimonious reverse
constraint trees for each node in the combined data tree (Fig. 4).
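The distinction between the two ways of combining can be illustrated with a short sketch. It assumes that each candidate topology has already been scored three times: under the 28S model, under the mtDNA model, and under a single model fitted to the combined data. The function names and all lnL values are hypothetical and serve only to show that the two approaches may rank trees differently.

```python
from typing import Dict


def sum_model_scores(
    lnl_28s: Dict[str, float], lnl_mt: Dict[str, float]
) -> Dict[str, float]:
    """SUM score per topology: lnL(28S, own model) + lnL(mtDNA, own model)."""
    if lnl_28s.keys() != lnl_mt.keys():
        raise ValueError("both partitions must be scored on the same trees")
    return {tree: lnl_28s[tree] + lnl_mt[tree] for tree in lnl_28s}


def relative_lnl(scores: Dict[str, float]) -> Dict[str, float]:
    """Difference in lnL of each tree from the best tree (best tree = 0.0)."""
    best = max(scores.values())
    return {tree: best - score for tree, score in scores.items()}


# Hypothetical lnL values for three candidate topologies.
lnl_28s = {"t1": -2100.0, "t2": -2095.0, "t3": -2110.0}
lnl_mt = {"t1": -5300.0, "t2": -5312.0, "t3": -5301.0}
lnl_combo = {"t1": -7760.0, "t2": -7758.5, "t3": -7771.0}

sum_delta = relative_lnl(sum_model_scores(lnl_28s, lnl_mt))
combo_delta = relative_lnl(lnl_combo)

best_sum = min(sum_delta, key=sum_delta.get)
best_combo = min(combo_delta, key=combo_delta.get)
```

In this toy case the SUM scores favour one topology and the COMBO scores another, and the COMBO lnL values are lower overall because a single model must serve both partitions.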

Partition support analysis 

In addition to summary statistics of the fit or otherwise of each
data-set to the other, hidden support or conflict can be apportioned to
each node for each partition (Baker & DeSalle 1997) using the reverse
constraint tree set. Here we can extend this by comparing the mtDNA and
28S components in the combined analysis (COMBO) and the components in
the SUM method. The partitioned likelihood scores in the combined data
model are sums of the site -lnL values.

Assessing the effect of model design on tree support 
space 

We determined a suitable nucleotide substitution model on the basis of
the relative likelihood scores of the MP bootstrap consensus tree as
parameters were added to the model, following the method of Posada &
Crandall (1998) but interpreting the ΔlnL conservatively in accordance
with the observations of Takahashi & Nei (2000) and Yang (1997).


Different models and parameter values can change the absolute likelihood
substantially but the relative log likelihood amongst competing trees -
the likelihood support surface - may be less sensitive to the details of
the model over the range of plausible trees (Yang et al. 1995). We have
investigated the effect of different models by comparing the relative
log likelihood (ΔlnL) among trees, using the 25 megascolecid taxa for
which we have both 28S and mtDNA data-sets. This is presented in
Figure 6 as the lnL relative to the best SUM model tree (Fig. 4) for a
subset of the near-parsimonious reverse constraint trees used in the SUM
method search set. In particular we compare the SUM versus the COMBO
models. These support surface reverse constraint trees cover a wide
range of likelihoods with small ΔlnL between each.

Hypothesis testing 

Specific hypotheses were investigated with likelihood ratio tests
comparing the best tree with the best tree constrained to that
hypothesis. As there are a number of competing tests, and some
uncertainty over their qualities, we used the distribution of lnL
relative to the ML tree among BS pseudoreplicates as the basis for the
likelihood ratio tests of Kishino & Hasegawa (1989) and Shimodaira &
Hasegawa (1999) as described in Goldman et al. (2000), and for the
estimated confidence from expected likelihood weight (c) of Strimmer &
Rambaut (2002). These methods were applied to the best trees consistent
with the hypotheses listed in the results and in Table 5. The estimated
likelihood weight confidence was also applied to the COMBO model reverse
constraint trees (Table 4). The distribution of ΔlnL was calculated from
200 PHYLIP-generated bootstrap pseudoreplicates, using fixed model
parameters.
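The expected likelihood weight calculation can be sketched as below. Note the swapped-in technique: this sketch resamples site log-likelihoods (the RELL approximation) rather than re-analysing PHYLIP-generated pseudoreplicates as described above, and the function names and toy site values are hypothetical. Weights are averaged over pseudoreplicates and trees are added to the confidence set in order of decreasing weight until the chosen level is reached.

```python
import math
import random
from typing import Dict, List


def rell_replicates(
    site_lnl: Dict[str, List[float]], n_reps: int, rng: random.Random
) -> List[Dict[str, float]]:
    """Bootstrap total lnL per tree by resampling sites with replacement."""
    trees = list(site_lnl)
    n_sites = len(site_lnl[trees[0]])
    reps = []
    for _ in range(n_reps):
        idx = [rng.randrange(n_sites) for _ in range(n_sites)]
        reps.append({t: sum(site_lnl[t][i] for i in idx) for t in trees})
    return reps


def expected_likelihood_weights(reps: List[Dict[str, float]]) -> Dict[str, float]:
    """Average, over replicates, of each tree's normalised likelihood."""
    trees = list(reps[0])
    totals = dict.fromkeys(trees, 0.0)
    for rep in reps:
        top = max(rep.values())  # subtract the maximum to avoid underflow
        like = {t: math.exp(rep[t] - top) for t in trees}
        norm = sum(like.values())
        for t in trees:
            totals[t] += like[t] / norm
    return {t: totals[t] / len(reps) for t in trees}


def confidence_set(weights: Dict[str, float], level: float = 0.95) -> List[str]:
    """Trees in order of decreasing weight until the cumulative weight >= level."""
    chosen, cumulative = [], 0.0
    for tree, w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(tree)
        cumulative += w
        if cumulative >= level:
            break
    return chosen


# Hypothetical site lnL values: tree A is better than tree B at every site.
site_lnl = {"A": [-1.0] * 10, "B": [-1.5] * 10}
weights = expected_likelihood_weights(rell_replicates(site_lnl, 200, random.Random(1)))
conf_set = confidence_set(weights)
```

A tree falling outside the confidence set corresponds to a hypothesis rejected at that level.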

Table 3. — GTR-Γ model parameter estimates for the 25 taxa of
megascolecid 28S, mtDNA, and combined data-sets. Values are given as
28S/mtDNA/combined. Alpha = 0.196/0.295/0.254. The T-G rate is
normalised to 1. Four discrete rate categories for gamma; empirical base
content.

     C                G                 T                  base content
A    1.40/3.95/2.06   1.53/11.09/4.92   1.61/10.41/8.73    0.15/0.36/0.27
C                     0.37/0.96/0.99    4.33/28.35/13.20   0.32/0.18/0.25
G                                       1                  0.38/0.19/0.27
T                                                          0.15/0.27/0.21


Parametric tests

Parametric bootstrapping is a statistical tool for producing independent
replicates of a study based on parameters estimated from a unique
data-set (Huelsenbeck & Hillis 1996). Parametric methods explore the
variance in the explicit model by creating data-sets based on that
model. It is therefore a question of how much faith one puts in the
model as a reasonable measure of the natural variance or uncertainty in
the real data (Goldman 1993; Yang et al. 1995; Huelsenbeck & Hillis
1996). Parametric methods are, however, ideal for comparing models
(Goldman 1993; Huelsenbeck & Bull 1996; Huelsenbeck & Rannala 1998). We
have used both facets of parametric methods. In assessing reliability of
phylogenetic inference we use the approach developed by Swofford et al.
(1996), illustrated in Bromham & Degnan (1999), and which Goldman et al.
(2000) dubbed the SOWH (Swofford, Olsen, Waddell and Hillis) test: can a
result consistent with one hypothesis be obtained by chance out of
parametric data-sets built on the best observed data model that is not
consistent with that hypothesis?
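The final step of such a parametric test, comparing the observed ΔlnL with the distribution obtained from simulated data-sets, can be sketched as follows. The simulation itself is not shown. The function name and the numbers are hypothetical, and the "+1" estimator used here is one common conservative convention, assumed for illustration rather than taken from the analyses reported in this paper.

```python
from typing import Sequence


def parametric_p_value(observed_delta: float, simulated_deltas: Sequence[float]) -> float:
    """Proportion of simulated lnL differences at least as large as the observed one.

    One is added to numerator and denominator so that the estimate is never
    exactly zero with a finite number of replicates.
    """
    exceed = sum(1 for d in simulated_deltas if d >= observed_delta)
    return (exceed + 1) / (len(simulated_deltas) + 1)


# Hypothetical example: 100 simulated replicates, none reaching the observed value.
simulated = [0.5] * 100
p = parametric_p_value(16.7, simulated)
```

With 100 replicates and no simulated value reaching the observed difference, this estimator gives a p-value just under 0.01.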

Given the nature of our taxon and sequence sampling, where, because of
logistic and other constraints, only subsets of taxa have the full
complement of sequence data, we approach the analyses in a hierarchical
manner, establishing the existence of higher groups so that subsequent
analyses of within-group relationships can be conducted using subsets of
taxa that have larger suites of sequences. The phylogenetic analyses
here comprise a 55-taxon 28S alignment of 549 sites; combined 28S and
mtDNA for 42 species, in which a given species may be represented by
one, two or three genes; and a more detailed study, with hypothesis
testing, of 25 species for which all three genes have been sequenced
(see Table 1). This last includes 622 28S sites and 749 combined 12S and
16S mitochondrial sites.


RESULTS 

Patterns of variation in 28S and mtDNA 
data 

Across all (55) taxa, 210 of 622 sites are variable in the megascolecid
28S rDNA, 352 of 549 in the higher (suprafamilial) oligochaete 28S, and
403 of 749 in the mtDNA, thus amounting to high levels of observed
differences. The nuclear and mitochondrial genes show substantially
different base contents while the mtDNA genes are more similar to each
other (Fig. 1). The 28S rDNA is CG rich while the mitochondrial genes
are AT rich. The mitochondrial genes show a larger range in base
content, as might be expected with higher levels of divergence. The
higher oligochaete 28S data-set (Fig. 2) shows significant base
composition variation (p < 0.001, PAUP* test).
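A test of base composition homogeneity across taxa can be sketched as a chi-square statistic on a taxon-by-base contingency table. This is an assumption about the general form of such a test and is not a reproduction of the PAUP* implementation; the function name and the counts are hypothetical.

```python
from typing import Dict, List, Tuple


def base_composition_chi2(counts: Dict[str, List[int]]) -> Tuple[float, int]:
    """Chi-square statistic and degrees of freedom for a taxon x base table.

    counts maps each taxon to its [A, C, G, T] counts. Expected counts
    assume every taxon shares the pooled base frequencies.
    """
    rows = list(counts.values())
    col_totals = [sum(row[j] for row in rows) for j in range(4)]
    grand_total = sum(col_totals)
    chi2 = 0.0
    for row in rows:
        row_total = sum(row)
        for j in range(4):
            expected = row_total * col_totals[j] / grand_total
            if expected > 0:  # skip bases absent from every taxon
                chi2 += (row[j] - expected) ** 2 / expected
    df = (len(rows) - 1) * 3
    return chi2, df


# Hypothetical counts: one CG-rich and one AT-rich sequence.
chi2, df = base_composition_chi2({"x": [10, 40, 40, 10], "y": [40, 10, 10, 40]})
# Two sequences with identical proportions give a statistic of zero.
chi2_same, df_same = base_composition_chi2({"a": [10, 20, 30, 40], "b": [20, 40, 60, 80]})
```

A large statistic relative to its degrees of freedom indicates that the taxa do not share a common base composition.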

Using MP trees, large gains in likelihood are made by allowing for site
rate heterogeneity with the gamma parameter (Γ) and also by allowing for
substitution rate heterogeneity using the six-rate general time
reversible (GTR) model (Yang 1994). Some gain is made by allowing for
invariant sites in the 25-taxon megascolecid set (2.9 ΔlnL: significant
by Modeltest 3.0). We have interpreted the ΔlnL conservatively and kept
to the GTR-Γ model as adequate, in accordance with the observations of
Takahashi & Nei (2000) and Yang (1997). Despite the opposite bias in
base content, both gene systems show a high estimated T-C rate and low
G-C rate (Table 3). Both show high levels of site rate heterogeneity
(alpha < 0.3), the 28S the more so. The estimated parameters are
noticeably different from two-parameter models that distinguish
transition/transversion rates only. The combined data model is a
compromise between the two, resulting in much more even base contents
and intermediate rate values (Table 3). Using the optimized GTR-Γ
models, estimated divergences (using the 25 taxa with all genes) range
up to more than 0.30 for the 28S rDNA and to over 0.70 for the mtDNA (a
substantial increase over observed difference), with a wide range of
relative divergence between genes, 3-4 fold among taxa of similar 28S
divergence. Thus, the data probably suffer saturation effects in the
mtDNA relative to the 28S, and substantial base content variation.

[Figure 2 tree image not reproduced. Terminal taxa span the
Megascolecidae, the other crassiclitellate families (Ocnerodrilidae,
Sparganophilidae, Komarekionidae, Almidae, Lutodrilidae,
Hormogastridae, Lumbricidae, Microchaetidae, Eudrilidae,
Glossoscolecidae), Lumbriculidae, Hirudinida, Branchiobdellida,
Haplotaxidae, Tubificidae and Enchytraeidae.]

Fig. 2. — 28S rDNA maximum parsimony majority rule consensus of 1000 bootstrap resamples, five random sequence additions
each with tree bisection reconnection swapping. Basal polytomy rooted with Haplotaxis gordioides (Hartmann, 1821). Maximum
parsimony bootstraps above node, maximum likelihood bootstrap values below (200 resamplings from reduced 36 taxa set using
only nine Megascolecidae). Those underlined were not recovered in the majority rule. Family and some higher level groups indicated
(see Table 1 for details).

High-level oligochaete phylogenetic
inference using 28S

The initial oligochaete 28S data-set contained 55 taxa, including
representatives of leeches, lumbriculids, branchiobdellids, enchytraeids
and haplotaxids. Preliminary investigations indicated that the addition
of polychaete sequences provided little insight into the deeper root and
was dependent on alignment. Considering the diversity of polychaetes in
molecular data, which fail to recover monophyly (e.g., 18S, see
Winnepenninckx et al. 1998; elongation factor 1-alpha, see Kojima 1998)
or show extreme polyphyly (Rota et al. 2001), the most appropriate
outgroup for this data-set is uncertain but, by historical precedent,
networks are presented as rooted with Haplotaxis species.

Figure 2 shows the result of an MP bootstrap with 1000 resamplings for
the 549 bp of 28S, with the ML bootstrapping using the GTR-Γ model below
(reducing the number of taxa to 36 by representing the Megascolecidae
with nine taxa across the Acanthodrilinae and Megascolecinae). Key
conclusions from this are: the first two groupings (Crassiclitellata and
the Megascolecidae + Eukerria saltensis [= Megascolecoidea, 94 %]) are
statistically robust and, considering the level of our sampling, the
Crassiclitellata and the Megascolecidae are each a monophyletic group.

A by-product is support for the published findings from 18S and COI for:
1) the Lumbriculidae; 2) the Enchytraeidae; 3) the Naididae-Tubificidae;
and 4) a leech-branchiobdellid clade. This last has been controversial
owing to the concern that molecular artifacts (under the rubric of long
branch attraction) undermine confidence in the result. Here, there is
significant base composition variation in the 28S (p < 0.001, PAUP* test,
and see Fig. 1), the leeches and the branchiobdellids in particular
being distinct from the others. Bootstrap support is fairly high in MP
(76 %) but not sustained in ML (16 %), and the leech-branchiobdellid
grouping is conspicuously absent using LogDet, a method that is supposed
to compensate for nonstationarity (Lockhart et al. 1994). Further,
parametric data-sets made to trees that do not contain this clade, when
analysed with MP, group them at high bootstrap frequency (> 70 %;
following the method of Huelsenbeck 1998, results not shown); we
therefore suspect that MP support is artifactually high. Conversely, the
low ML support may be due to long branch repulsion effects (Pol &
Siddall 2001). Notwithstanding the limited outgroups, the tight linkage
of the 28S to the 18S, and a molecular bias that could confound
methodology, the results here may be seen as some corroboration of this
proposal.

The present study considers chiefly relationships 
in the Crassiclitellata and the Megascolecidae but 
some comment on wider clitellate relationships is 
given in the Discussion below. 

Crassiclitellate taxa included in the morphocladistic analysis (Jamieson
1988) were the aquamegadrile families (with superfamilies there
recognized) Sparganophilidae (Sparganophiloidea Michaelsen, 1918),
Biwadrilidae (Biwadriloidea Jamieson, 1971), Almidae (including
Criodrilus), Lutodrilidae (both Almoidea Dubosq, 1902, sensu Jamieson
1988) and, tentatively included, Kynotidae (new familial status for
Kynotinae Jamieson, 1971), and the terrimegadrile families
Ocnerodrilidae (Ocnerodriloidea), Eudrilidae (Eudriloidea),
Microchaetidae, Hormogastridae, Glossoscolecidae, and Lumbricidae (all
Lumbricoidea) and the Megascolecidae (Megascolecoidea). In the present
molecular analyses, all but the Biwadrilidae and Kynotidae are
represented.

Separate and combined 28S rDNA and mtDNA
phylogenetic analysis of the Megascolecidae

Here we attempt to give greater resolution among the Megascolecidae by
retrieving more sites within the 28S data, eliminating most outgroup
taxa and adding combined 12S and 16S mitochondrial rDNA sequences.
Figure 3 shows a strict consensus MP tree, with bootstrap values added
from a separate analysis, for 42 species of Megascolecidae and outgroup
families for combined 28S, 12S and 16S gene sequences. These make a
useful comparison with the 28S-only analysis (Fig. 2) and subsequent
analyses of the 25 taxa available with all data but only two outgroup
families. Despite the heterogeneity of the number of genes sampled, the
tree in Figure 3 is highly informative and largely consistent with other
trees for reduced taxon sets with uniform gene samples (see below).

As previously noted, we restrict the main mixed gene analyses (Figs 4;
5; Table 4) to 25 taxa for which we have all three genes: 622 sites of
28S and 749 sites of combined 12S and 16S mitochondrial rDNA sequence.
Given the not unreasonable possibilities of dynamic and historical
heterogeneity, in the first instance we analyse the two data partitions
separately, then compare this with combined analysis results. These
analyses are performed with both ML and MP methods to compare the effect
of allowing for complex process models. The likelihood model parameters
for the separate and combined analyses are shown in Table 3.

Incongruence and compatibility 
The ML trees for the two partitions conflict in more than a few nodes so
that the semi-strict consensus of the trees (Fig. 5A) loses much
resolution. Neither tree has many strongly supported nodes so that
majority rule bootstrap trees (not shown) also have low resolution, but
where the trees differ, all but one bootstrap value are < 50 %. Maximum
parsimony produces much the same results (not shown). The
character-based partition homogeneity test (ILD) is marginally
non-significant at p = 0.066. This could also reflect the substantially
different characteristics of the 28S and mtDNA data (Dolphin et al.
2000). However, likelihood ratio tests, both parametric and
non-parametric, also record apparent phylogenetic incongruence: the
optimal ML tree for each data partition is incompatible with the other
by the SH test (54.3/86.9 ΔlnL, 28S/mtDNA, p < 0.001) but not with the
combined data tree (24.6/5.7 ΔlnL, 28S/mtDNA, p > 0.1); the partitions
are also significantly incongruent (p < 0.01) according to the
parametric test of Huelsenbeck & Bull (1996). Despite evidence of
incongruence, the two partitions combine to produce a more robust
result. In addition to the congruence in the consensus tree (Fig. 5A),
this can be seen in the average Bremer values of the best tree for each
partition as compared with that for the combined data. Thus, the
combined data value is greater than the sum of the partitions,
indicating hidden support (128 vs 117), and there is also an increase in
bootstrap levels (not shown).
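The hidden support implied by the Bremer comparison above can be written out explicitly. The symbol HBS and the subscripts are notation introduced here for illustration only; the two totals are those quoted in the text.

```latex
\mathrm{HBS}
  = \mathrm{BS}_{\mathrm{combined}}
  - \left( \mathrm{BS}_{\mathrm{28S}} + \mathrm{BS}_{\mathrm{mtDNA}} \right)
  = 128 - 117 = 11
```

A positive value indicates that the combined data support the tree more strongly than the partitions do when analysed separately.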

Combining support versus combining data 
We compare two methods of combining information in the 28S and mtDNA
data: combining the data and analysing with a single model - the COMBO
method; and combining the likelihoods of each partition using separate
models - the SUM method. For the latter we evaluated 14586 unique
topologies drawn from 1000 near-parsimonious reverse constraint trees
for each node in Figure 4, to represent the likelihood “support
surface”. At least in this case the combined data ML and the MP methods
are close enough to provide the essential candidate topologies, which
contain all the combined data ML best reverse constraint trees. This set
therefore represents a reasonable search space for the absolute best SUM
model trees. The likelihoods for each of these topologies were
calculated for both SUM and COMBO models. Table 4 tabulates various
measures of support for each of these nodes: the partitioned likelihood
support; the analogous MP Bremer values; MP and ML bootstrap values;
and, in addition, the Strimmer & Rambaut (2002) expected likelihood
weight estimated confidence (c) for the set of ML and reverse constraint
trees.

[Figure 3 tree image not reproduced. Terminal taxa are Megascolecidae
(including the Acanthodrilinae as here defined) with outgroups from the
Ocnerodrilidae, Lumbricidae, Glossoscolecidae and other crassiclitellate
families; each terminal is labelled 1, 2 or 3 according to the genes
available.]

Fig. 3. — Strict consensus of six maximum parsimony trees using the combined data. Taxa with one, two or three genes indicated.
Heuristic search, 1371 sites, unweighted, 50 random additions, tree bisection reconnection with steepest descent, with bootstrap
consensus values from 500 resamplings (heuristic search, unweighted, five random additions, tree bisection reconnection with steepest
descent). 1, 28S (12S only for Rhododrilus glandifera Jamieson, 1995); 2, 28S + 12S; 3, 28S + 12S + 16S.


The SUM model estimated best tree (Fig. 4) is the same as one of two
equally most parsimonious trees, the other being 11.09 lnL units away
according to the SUM model. The COMBO model best tree differs in two
nodes: the positions of Eukerria saltensis and of Pontodrilus litoralis,
at 0.28 ΔlnL according to the COMBO model.

[Figure 4 tree image not reproduced. Terminals are the 25 taxa with all
three genes, from Fletcherodrilus sigillatus to the outgroups Eukerria
saltensis and Lumbricid sp.]

Fig. 4. — Maximum likelihood tree from the sum of the 28S rDNA and mtDNA partitions optimized for GTR-Γ model separately (SUM
model best tree). Selected from 14586 near-parsimonious reverse constraint trees. The maximum parsimony bootstrap tree is the
same. Dashed lines indicate the two topology differences found in the combined data model maximum likelihood tree (COMBO
model). Bold letters are clade labels for Table 4. Maximum parsimony and maximum likelihood bootstrap values above and below
branches respectively. Partition support shown in Table 4.


Table 4. — Separate and combined partition support for nodes in the SUM model tree. All values refer to the support by each
data-set for clades defined in Fig. 4. Partition support refers to the best reverse constraint tree for that method relative to the SUM
model best tree; COMBO model lnL split into 28S sites and mtDNA sites. Shading indicates conflict. c, expected likelihood weight
confidence set from COMBO model reverse constraint trees and maximum likelihood best tree. Arranged in order of c; bold values
indicate 95% confidence set.

c      Bootstrap    Bremer               Partition ΔlnL,         Partition ΔlnL,
                                         separate models         combined data analysis
COMBO  COMBO  MP    all    mt     28S    SUM     mt      28S     all     mt      28S     clade
0.00   100    98    14     6      8      35.15   28.32   6.83    34.67   28.17   6.50    A
0.00   97     97    15     4.5    10.5   15.95   6.85    9.10    13.73   6.98    6.76    C
0.00   89     62    7      5      2      13.90   18.48   -4.59   16.62   10.59   6.03    F
0.00   98     84    10     3.7    6.3    20.82   12.12   8.70    22.54   13.37   9.16    V
0.00   88     69    10     8.5    1.5    3.09    3.09    0.00    3.78    5.31    -1.53   H
0.00   95     96    12     10     2      12.10   7.26    4.83    13.65   6.21    7.44    T
0.01   95     59    0      -4     4      11.09   -0.75   11.83   12.80   -1.27   14.07   D
0.02   46     55    4      3      1      0.99    -0.47   1.45    4.81    1.88    2.93    Q
0.02   80     84    4      3      1      0.99    0.95    0.05    2.56    3.02    -0.45   S
0.02   75     77    4      -1     5      5.88    3.26    2.61    4.09    2.42    1.67    J
0.03   90     69    4      3      1      7.13    0.67    6.46    11.08   1.56    9.52    B
0.03   59     43    4      0.4    3.6    2.75    3.00    -0.26   4.75    4.36    0.38    K
0.04   71     64    2      -2     4      4.09    2.02    2.07    3.10    0.19    2.91    U
0.04   27     53    3      1      2      0.62    0.62    0.00    -0.28   0.85    -1.13   M
0.04   29     89    10     6.7    3.3    2.75    2.75    0.00    -0.28   0.85    -1.13   R
0.06   59     55    2      -5     7      2.54    0.79    1.75    0.77    1.33    -0.56   G
0.06   23     14    2      0      2      2.23    3.14    -0.91   1.50    3.38    -1.87   L
0.07   79     65    5      7      -2     0.88    8.85    -7.97   4.56    10.83   -6.27   E
0.10   77     79    7      4      3      7.53    4.68    2.85    6.63    3.79    2.84    P
0.10   47     31    3      -1     4      5.23    5.23    0.00    5.58    5.40    0.18    I
0.10   41     36    3      0      3      3.59    2.75    0.84    5.58    5.40    0.18    O
0.17   39     38    3      -1.5   4.5    3.19    -3.40   6.60    1.53    -4.41   5.94    N
       68.4   64.4  5.8    2.3    3.5    7.4     5.0     2.4     7.9     5.0     2.9     average per node


Effect of model design

The effect of model design (SUM versus COMBO models), as well as of
adding invariant sites and parameter optimization, is graphically
represented in the comparison of the relative lnL among reverse
constraint trees (Fig. 6). These trees are a subset of the SUM method
search set. The deviation in relative likelihood among trees using fixed
parameters versus parameters optimised for each tree is negligible (SUM
versus optimized), saving a considerable computational burden. While
there can be substantial difference in relative likelihood between the
SUM and COMBO models (average difference 2.8 lnL), the lnL support for
each node in the ML tree seems less affected (see Table 4). The COMBO
model is a substantially poorer fit to the data (> 350 ΔlnL, d.f. = 108,
p < 0.001; Wilgenbusch & De Queiroz 2000), although it cannot be
rejected by the Goldman (1993) parametric test as an adequate
representation of the real data (p > 0.26, 100 replicates). Compared
with this, introducing an invariant sites term in the model (as directed
by standard likelihood ratio tests) produces little difference (average
difference 1.1 lnL).
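The scale of the difference in fit between the SUM and COMBO models can be illustrated with a standard likelihood ratio calculation, in which twice the difference in lnL is compared with a chi-square distribution. The sketch below uses the Wilson-Hilferty normal approximation to the chi-square tail so that it needs only the standard library; that approximation, the function name, and the assumption that the reported test took this form are ours.

```python
import math
from typing import Tuple


def lrt_p_value(delta_lnl: float, df: int) -> Tuple[float, float]:
    """Likelihood ratio statistic 2*ΔlnL and its approximate chi-square p-value.

    The upper tail probability uses the Wilson-Hilferty cube-root normal
    approximation, which is adequate for large degrees of freedom.
    """
    stat = 2.0 * delta_lnl
    k = float(df)
    z = ((stat / k) ** (1.0 / 3.0) - (1.0 - 2.0 / (9.0 * k))) / math.sqrt(2.0 / (9.0 * k))
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return stat, p


# Figures quoted in the text: a difference of at least 350 lnL units, d.f. = 108.
stat, p = lrt_p_value(350.0, 108)
# Sanity check: a statistic equal to the degrees of freedom is unremarkable.
stat_null, p_null = lrt_p_value(54.0, 108)
```

A statistic of 700 against 108 degrees of freedom lies far in the tail, in line with the reported p < 0.001.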

Partition support (Baker & DeSalle 1997) can be a guide to distribution
of conflict or support. Table 4 provides such a summary of the
similarities and differences between partitions and among methods for
node support, providing a basis for interpreting these measures of
support and conflict.
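Reading conflict from partition support values can be sketched as follows, using the SUM model mtDNA and 28S columns of Table 4 for four of the clades. The function name is hypothetical, and treating any negative partition value as conflict is our reading of the table. Totals recomputed from the rounded table entries may differ from the tabulated totals in the last decimal place.

```python
from typing import Dict, Tuple


def summarise_partition_support(
    rows: Dict[str, Tuple[float, float]]
) -> Dict[str, Dict[str, object]]:
    """Total support and a conflict flag for each clade.

    rows maps a clade label to (mtDNA support, 28S support). A clade is
    flagged as in conflict when either partition has negative support.
    """
    return {
        clade: {"total": mt + nuc, "conflict": mt < 0 or nuc < 0}
        for clade, (mt, nuc) in rows.items()
    }


# SUM model partition ΔlnL for clades A, D, E and N (Table 4): (mtDNA, 28S).
sum_rows = {
    "A": (28.32, 6.83),
    "D": (-0.75, 11.83),
    "E": (8.85, -7.97),
    "N": (-3.40, 6.60),
}
summary = summarise_partition_support(sum_rows)
conflict_clades = sorted(c for c, s in summary.items() if s["conflict"])
```

In these four rows clade A is supported by both partitions, whereas D, E and N each have one partition favouring the reverse constraint tree.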

The combined data tree resembles the mtDNA tree more than it does the
28S tree for both ML and MP (comparing the symmetric difference metric
of Penny & Hendy [1985]). The mtDNA contributes more to the relative lnL
but the 28S more to the Bremer score. Combined analysis, by either the
SUM or the COMBO method, still shows conflict but also indicates hidden
support: for example, nodes B, J, A and G in Fig. 4, which are in
conflict in separate analysis, are both supported in the combined
methods. The mtDNA has more conflict in Bremer; less in ML. Overall (as
might be expected of a simpler model) the COMBO model has slightly
higher levels of support and conflict than the SUM, resulting in very
similar proportions of conflict; MP Bremer shows a greater proportion of
conflict. A reduction in conflict in ML versus MP suggests a more
complex model is appropriate; reduction in conflict in the SUM versus
COMBO methods may reflect reduced dynamic heterogeneity. In each method
the conflict is spread across a similar number of nodes (7-8 of 22),
with the majority concentrated into a few nodes. Each method has a
slightly different mix of nodes and values, so that overall the
component parts of 28S and mtDNA vary between the two methods more than
the total ΔlnL and rank order of the nodes. Of the combinations of
conflict: 1) strong conflict in all methods (e.g., Fig. 4, nodes E, N),
all in the same direction, may indicate historical heterogeneity; 2)
difference in conflict between COMBO and SUM (e.g., Fig. 4, nodes F, G,
H, M, R, S), most of which is slight (except F), may indicate dynamic
heterogeneity.
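The symmetric difference between two trees counts the groupings found in one tree but not the other. A minimal sketch, assuming each unrooted tree has already been reduced to its set of internal splits (the function names, reference-taxon convention and toy taxa are hypothetical):

```python
from typing import FrozenSet, Iterable, Set


def normalise_split(split: Iterable[str], taxa: Set[str], reference: str) -> FrozenSet[str]:
    """Represent a split by the side that does not contain the reference taxon."""
    side = frozenset(split)
    return frozenset(taxa) - side if reference in side else side


def symmetric_difference(
    splits_a: Iterable[Iterable[str]],
    splits_b: Iterable[Iterable[str]],
    taxa: Set[str],
    reference: str,
) -> int:
    """Number of splits present in exactly one of the two trees."""
    a = {normalise_split(s, taxa, reference) for s in splits_a}
    b = {normalise_split(s, taxa, reference) for s in splits_b}
    return len(a ^ b)


# Two hypothetical five-taxon trees sharing the grouping (a, b).
taxa = {"a", "b", "c", "d", "e"}
tree_1 = [{"a", "b"}, {"d", "e"}]       # {d, e} is the same split as {a, b, c}
tree_2 = [{"a", "b"}, {"a", "b", "d"}]
distance = symmetric_difference(tree_1, tree_2, taxa, reference="e")
```

Comparing the combined data tree with each partition tree in this way gives a simple count of how many groupings differ.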

Figure 5 shows the effect of different types of partition support on
consensus: Fig. 5A is the consensus of independent 28S and mtDNA ML
trees; Fig. 5B the consensus of SUM model reverse constraint trees that
do not show conflict; Fig. 5C the consensus of COMBO model reverse
constraint trees that do not show conflict. When combined, the
partitions show less conflict than apart, indicating hidden support
(Baker & DeSalle 1997; Lee & Hugall in press), reflected in the number
of nodes resolved (11 vs 14). The SUM and COMBO models differ in the
position of Eukerria saltensis - the SUM model is perhaps to be
preferred as: a) ΔlnL and bootstrap values are weak in the COMBO model;
and b) it is a more conventional arrangement - outside the
Megascolecidae s.l.

Table 5. — Comparison of likelihood ratio tests. †, best trees, fixed parameters; *, SOWH test, 100 replicates, fixed parameters; c†,
expected likelihood weight (Strimmer & Rambaut 2002). Shading indicates rejected at α < 0.05; dash = test not done; §, 28S, Fig. 3
reduced taxa set data, Shimodaira-Hasegawa test only; 1, 28S; 2, mtDNA; 3, COMBO.

Hypothesis of            ΔlnL                 BS ML†             KH test†           SH test†           Parametric test*   c†
monophyly                1      2      3      1     2     3      1     2     3      1     2     3      1     2     3      1     2     3
Acanthodrilidae          62.6   67.5   126.3  0.00  0.00  0.00   <0.01 <0.01 <0.01  <0.01 <0.01 <0.01  -     -     -      <0.01 <0.01 <0.01
Dichogastrini            33.9   53.4   88.5   0.01  0.01  0.00   0.01  0.00  <0.01  0.10  <0.01 <0.01  -     -     -      0.01  <0.01 <0.01
Octochaetidae            7.0    14.3   21.0   0.25  0.03  0.00   0.57  0.10  0.02   0.58  0.40  0.37   0.01  <0.01 -      0.26  0.03  <0.01
Racemose Megascolecidae  16.7   19.0   31.5   0.03  0.05  0.03   0.08  0.09  0.05   0.25  0.26  0.16   <0.01 <0.01 -      0.02  0.06  0.03
Aquamegadrili §          28.2   -      -      -     -     -      -     -     -      0.032 -     -      -     -     -      -     -     -
Lumbricoidea §           35.7   -      -      -     -     -      -     -     -      0.011 -     -      -     -     -      -     -     -


Tests of molecular support for specific morphological
classifications

The trees so far presented provide evidence against a number of
morphological classifications. Overall, the data appear to combine so
that the whole is greater than the sum of the parts, but the bootstrap
support for many nodes is weak. Here we test several morphological
hypotheses of monophyly based on the likelihood ratio of the ML tree
versus the best tree constrained to each of the alternative hypotheses.
The possible genuine conflicts suggested in the partition analyses are
not relevant to the hypotheses tested here. Generic names given in the
tests represent the species listed for those genera in Table 1; author
names for these genera are therefore not required. Table 5 summarizes
the information on likelihood ratio tests, both non-parametric
(bootstrap distribution and derived KH [Kishino-Hasegawa], SH
[Shimodaira-Hasegawa] and expected likelihood weight test statistics)
and parametric (SOWH). All tests use fixed parameter values optimized to
the hypothesis in question. Results in Figure 6 comparing fixed versus
optimized parameters indicate that the use of the RELL (resampling
estimated log-likelihood) approximation is reasonable. The following
tests within the Megascolecidae use the 25 taxa in Figure 4. We may
invoke congruence between the mtDNA and the 28S as additional
confidence. The first two megascolecid classifications (1 and 2 below)
are inconsistent with the consensus tree in Figure 5A; the others are
unresolved; all are inconsistent with the SUM and COMBO consensuses.
Bootstrap values against these hypotheses are generally low for separate
data-sets but high for the combined data; the SH test is significant for
only the strongest tests, while the parametric tests are significant for
the weakest, and therefore certainly significant for all. The expected
likelihood weight 95 % confidence set rejects most. Hypothetical
taxonomic groupings tested by maximum likelihood are as follows:

1. Acanthodrilidae Vejdowski, 1884 as redefined by Gates (1959, 1972):
(Diplotrema, Diporochaeta sp., Fletcherodrilus, Diporochaeta kershawi,
Pontodrilus) versus the rest.

2. Monophyly of the Dichogastrini Jamieson, 1971: (Dichogaster,
Didymogaster, Digaster) versus the rest.

1 and 2. Both the 28S and the mtDNA reject monophyly of the
Dichogastrini Jamieson, 1971 and the Acanthodrilidae sensu Gates 1959.

3. Octochaetidae sensu Gates 1959 (Gates 1959, 1972): (Dichogaster,
Neodiplotrema) versus the rest.

4. Racemose prostate Megascolecidae: (Spenceriella, Didymogaster,
Heteroporodrilus, Digaster, Begemius, Perionyx) versus the rest.


The cases of the Octochaetidae and of the racemose Megascolecidae
highlight the differences in methods of assessing support. Neither the
28S, the mtDNA nor the combined data trees are consistent with either
hypothesis. BS support against is high (95 %, 88 % respectively) for
combined data but low for partitions (e.g., 40 % in 28S); the KH test
shows a trend towards significance, not apparent in the SH test; all are
highly significant in the parametric test, but only 4/6 according to the
expected likelihood weight. In conclusion, the Octochaetidae and
Megascolecidae of Gates are not supported.

Two higher classifications within the Crassiclitellata can also be
tested using the 28S data (Fig. 2):

5. Aquamegadrili Jamieson, 1988 - Terrimegadrili Jamieson, 1988:
(Sparganophilus, Criodrilus, Lutodrilus) versus the rest.

6. Lumbricoidea: (Lumbricus, microchaetid, glossoscolecid,
Hormogaster) versus the rest.

Both are rejected by the conservative SH test. The 28S data-set is
inconsistent with monophyly of the Aquamegadrili (aquatic megadriles)
owing to the inclusion of Komarekiona as sister-taxon of
Sparganophilus. The Lumbricoidea is rejected principally because the
glossoscolecid never groups with the remainder in the best trees but
with Eudrilus.
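
Each hypothesis tested above has the form "a group of taxa versus the
rest", i.e. a bipartition that a candidate tree either contains or
does not. The following is a minimal sketch of that consistency
check, not the procedure used in this study (which relied on
constraint searches and likelihood tests). The nested-tuple tree
encoding, the function names and the tip labels A to E are
illustrative assumptions.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed encoding: a tip is a string, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Tuple[FrozenSet[str], Set[FrozenSet[str]]]:
    """Return (tips below this node, all clades found at or below it)."""
    if isinstance(tree, str):
        tips = frozenset([tree])
        return tips, {tips}
    tips: FrozenSet[str] = frozenset()
    found: Set[FrozenSet[str]] = set()
    for child in tree:
        child_tips, child_clades = clades(child)
        tips |= child_tips
        found |= child_clades
    found.add(tips)
    return tips, found


def supports_split(tree: Tree, group: Set[str]) -> bool:
    """True if the tree contains the bipartition 'group versus the rest'.

    The tree is treated as unrooted: the split is present when either
    the group or its complement forms a clade in the rooted encoding.
    """
    all_tips, found = clades(tree)
    members = frozenset(group)
    if not members <= all_tips:
        raise ValueError("group contains taxa absent from the tree")
    return members in found or (all_tips - members) in found


# Hypothetical five-tip topology, used only to exercise the functions.
toy_tree: Tree = ((("A", "B"), "C"), ("D", "E"))
print(supports_split(toy_tree, {"A", "B"}))  # group is a clade
print(supports_split(toy_tree, {"A", "C"}))  # group is split up by B
```

A tree that lacks the split is "inconsistent" with the hypothesis in
the sense used above; how strongly the data reject the hypothesis is
a separate question addressed by the bootstrap, KH, SH, expected
likelihood weight and SOWH results.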

DISCUSSION 

Consensus, combining data and likelihood model design

Concerning model choice and combining of data, we applied a three-way
procedure: 1) consensus of separate analyses; 2) combine data in the
COMBO model; and 3) combine likelihoods in the SUM method. The
separate analyses allowed each partition to be analysed under its own
optimum model but the SUM method constrains the data in such a way
that it chooses among combined score best trees. The use of multiple
models represents a design "parameter" that is of more substantial
influence than other parameters available (Fig. 6) but despite this
we find that, in this particular case

[Figure 5, panels A, B and C: tree diagrams not reproduced. Tip
labels shown in each panel (their order differs among panels):
Fletcherodrilus sigillatus, F. unicus, Terrisswalkerius millaamillaa,
T. phalacrus, T. kuranda, T. windsori, T. grandis, T. athertonensis,
Spenceriella sp., S. cormieri, Begemius queenslandicus, Pontodrilus
litoralis, Diporochaeta sp., Diporochaeta cf. kershawi, Digaster
anomala, Didymogaster sylvaticus, Heteroporodrilus sp., Perionyx
excavatus, Diplotrema sp., D. acropetra, Neodiplotrema altanmoui,
Dichogaster sp., Dichogaster sp. (sexprostatic), Eukerria saltensis,
Lumbricid sp.]

Fig. 5. — Semistrict consensus trees for the Megascolecid 28S 
and mtDNA data-sets; A, consensus of 28S and mtDNA 
maximum likelihood trees; B, consensus of SUM model reverse 
constraint tree showing conflict between 28S and mtDNA 
partitions (see Table 4); C, COMBO model partition conflict 
consensus. 


at least, the results presented are insensitive to the details of the
choice of model. This is especially seen in the similarity of ΔlnL
between the SUM and COMBO methods, and in the close similarity of the
ML and MP results (Table 4). The different characteristics of the
nuclear and mtDNA combine to give a model more similar to the MP
approach than either data-set, and the variance in the GTR-Γ model is
sufficient to accommodate all sites. Congruence among the methods,
and more particularly, the two data-sets (nuclear and mitochondrial)
can then be invoked as support for the robustness of our findings.

By combining data at the level of tree space, the SUM method allows
each partition maximum independence, without imposing any assumptions
of one upon the other. An intermediate procedure would be to employ
separate models for each partition but to use a combined data
estimate of branch length. The general principle of combining
information at the level of (relative) likelihoods of topologies
provides a way of combining disparate data, provided a model is
available. Some examples are: combining the PHYLIP likelihood method
for analysing allele frequency data (allozymes, etc.); DNA-DNA
hybridization data; ML method for morphology (Lewis 2001). Most of
these types of data cannot directly be combined into a matrix of
characters with states (even applying mixed models) but they do
provide relative likelihood for topologies - the likelihood support
surface. Likelihood is recognized as suited to combining disparate
types of data - combining the likelihood support surface by adding up
the likelihoods contributed by each type of data for each hypothesis.
Further, it provides a distinct way of dealing with data-sets that
are not completely matched - topologies involving taxa that are not
represented will have a flat support surface, as opposed to the
alternate procedure in the combined data model of integrating over
all possible states (Swofford 2000). The multi-model partition
analysis is inconvenient to apply because of the lack of automation.
For further use and elaboration (such as bootstrapping) it needs to
be computerized; this is apparently under development for future
versions of PAUP*.
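
The principle of adding up, for each candidate topology, the
log-likelihoods contributed by each partition can be sketched as
follows. This is an illustration under stated assumptions rather than
the scripts used here: the function name, the topology labels T1 to
T3 and all lnL values are hypothetical, and each partition's score is
assumed to have been computed beforehand under that partition's own
model.

```python
from typing import Dict, Tuple

Scores = Dict[str, float]  # topology label -> lnL for one partition


def sum_model_scores(
    partition_lnl: Dict[str, Scores],
) -> Tuple[str, Scores, Scores]:
    """Add per-partition lnL values for every topology scored in all partitions.

    Returns the best topology, the summed lnL per topology, and the
    difference in lnL between the best topology and each topology.
    """
    shared = set.intersection(*(set(s) for s in partition_lnl.values()))
    if not shared:
        raise ValueError("no topology was scored in every partition")
    totals = {
        topo: sum(scores[topo] for scores in partition_lnl.values())
        for topo in shared
    }
    best = max(totals, key=totals.get)
    delta = {topo: totals[best] - lnl for topo, lnl in totals.items()}
    return best, totals, delta


# Hypothetical scores: each partition prefers a different topology.
example = {
    "28S": {"T1": -1500.0, "T2": -1503.0, "T3": -1510.0},
    "mtDNA": {"T1": -4208.0, "T2": -4200.0, "T3": -4201.0},
}
best_topology, summed, delta_lnl = sum_model_scores(example)
print(best_topology, summed, delta_lnl)
```

In this toy case the 28S partition alone prefers T1 while the summed
score prefers T2, which is the kind of partition conflict the
reverse-constraint comparisons are meant to expose.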

Fig. 6. — Effect of model design on likelihood support surface across reverse constraint trees. Relative lnL of a selection of near
parsimonious reverse constraint trees compared to the best: SUM model tree score for the SUM model; the SUM GTR-Γ models with
fixed parameters; same but with extra invariant sites parameter; SUM GTR-Γ models with optimized parameters for each topology;
COMBO model fixed parameters. Line joining individual ΔlnL values is to aid visualization.


The partition analysis identifies the distribution of conflict,
amongst data-sets and amongst methods: much is spread across nodes at
low levels and can be dismissed as due to data limitations. If some
of the conflict is due to dynamic heterogeneity this may be reduced
in the SUM method, while localized consistent conflict across methods
needs to be identified. Even though nodes may be reasonably supported
in a combined data analysis such nodes must remain questionable
(e.g., Fig. 4, node E).

Congruence among data can be powerful evidence. However, direct
consensus of independent results can obscure underlying similarities.
On the other hand, profound conflict can be obscured in the combined
data approach. Both aspects are revealed in the consensuses
representing three levels of combining information shown in Figure
5A, B, C respectively: 1) completely separate; 2) separate models but
arbitrating over combined information results; and 3) combined model
and data, subsequently partitioned.

Given the current state of uncertainty as to the robustness and
sensitivity of phylogenetic likelihood ratio tests, we have taken the
opportunity to present the ΔlnL, the BS distribution values
(proportion of resamples in which a tree is better than the chosen ML
tree, see also Stuart-Fox et al. [2002]), the derived KH, SH and c
statistics, and the parametric (SOWH) test (Table 5). The uncertainty
hinges on plausibility of null hypotheses for non-parametric tests
and on how well the model fits the data for the parametric test. We
explored this in Table 5, and Figure 6 together with likelihood ratio
tests. As most of the model lies in the data patterns, the fixed
parameter approximation is worth the trivial sacrifice in precision;
as the 28S and mtDNA are very different, the multi-model (SUM) method
is of considerable gain. While the GTR-Γ model may be considered
optimized, the difference between parametric and non-parametric tests
(Table 5) suggests it is still too far from reality to justify
parametric methods. Here the most noticeable difference between the
parametric and the real data is the distribution of apomorphies,
which are more clustered in the real data, suggesting a lack of the
independence among sites assumed by the model (not shown). All the
other approaches use non-parametric pseudoreplicates, which always
have fewer patterns - the principal determinant of the likelihood
model. The KH test has been criticized as unfair to the null
hypothesis (Goldman et al. 2000) but on the other hand the SH test
can be weakened to the point of futility by including highly unlikely
trees (Goldman et al. 2000; Strimmer & Rambaut 2002). The expected
likelihood weighting essentially reflects the primary BS ΔlnL
distribution (the simplest form of all these tests) and captures most
of the consensus of information (Tables 4; 5).
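
The relationship between the RELL resamples, the BS distribution
values and the expected likelihood weights can be illustrated with a
short sketch. It assumes that per-site log-likelihoods are already
available for each candidate tree under fixed parameters (the RELL
approximation) and follows the general description of expected
likelihood weights in Strimmer & Rambaut (2002); it is not the
software used for Table 5. Function names, tree labels and all
numbers are hypothetical.

```python
import math
import random
from typing import Dict, List

SiteLnL = Dict[str, List[float]]  # tree label -> per-site lnL values


def rell_replicates(site_lnl: SiteLnL, n_reps: int, seed: int = 0) -> List[Dict[str, float]]:
    """Resample sites with replacement and re-sum lnL without re-optimizing."""
    rng = random.Random(seed)
    trees = list(site_lnl)
    n_sites = len(site_lnl[trees[0]])
    reps = []
    for _ in range(n_reps):
        cols = [rng.randrange(n_sites) for _ in range(n_sites)]
        reps.append({t: sum(site_lnl[t][c] for c in cols) for t in trees})
    return reps


def proportion_better(reps: List[Dict[str, float]], ml_tree: str) -> Dict[str, float]:
    """Fraction of resamples in which each tree scores above the chosen ML tree."""
    return {
        t: sum(rep[t] > rep[ml_tree] for rep in reps) / len(reps)
        for t in reps[0]
    }


def expected_likelihood_weights(reps: List[Dict[str, float]]) -> Dict[str, float]:
    """Average, over resamples, of each tree's normalized likelihood."""
    sums = {t: 0.0 for t in reps[0]}
    for rep in reps:
        top = max(rep.values())  # subtract the maximum to avoid underflow
        rel = {t: math.exp(lnl - top) for t, lnl in rep.items()}
        norm = sum(rel.values())
        for t, value in rel.items():
            sums[t] += value / norm
    return {t: s / len(reps) for t, s in sums.items()}


def confidence_set(weights: Dict[str, float], level: float = 0.95) -> List[str]:
    """Smallest set of trees, taken in order of weight, whose weights reach the level."""
    chosen, total = [], 0.0
    for tree, w in sorted(weights.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(tree)
        total += w
        if total >= level:
            break
    return chosen


# Hypothetical per-site lnL for three trees over four sites.
toy_sites = {
    "T1": [-2.0, -3.0, -1.5, -2.5],
    "T2": [-2.5, -2.6, -1.6, -2.4],
    "T3": [-6.0, -7.0, -5.5, -6.5],
}
toy_reps = rell_replicates(toy_sites, n_reps=200, seed=1)
bs_values = proportion_better(toy_reps, ml_tree="T1")
elw = expected_likelihood_weights(toy_reps)
print(bs_values, elw, confidence_set(elw))
```

Trees falling outside the confidence set are those rejected at the
chosen level; because both quantities are derived from the same
resampled lnL values, the weights track the BS distribution closely.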

Significance to taxonomy and morphological classifications

Crassiclitellata Jamieson, 1988
That the Crassiclitellata of Jamieson (1988), named for families in
which the clitellum is multilayered, is a monophylum is confirmed in
the molecular analyses. The MP majority rule consensus tree for 549
bp of 28S only (Fig. 2) gives 100% bootstrap support for the
Crassiclitellata versus outgroup taxa. Maximum likelihood analysis
differs little from a MP bootstrap tree. Eudrilus again associates
with the glossoscolecid Gen. sp. but not with Microchaetus.

Aquamegadrili Jamieson, 1988 and Lumbricoidea sensu Jamieson 1971
The Aquamegadrili, named by Jamieson (1988) for aquatic
crassiclitellates (see Introduction for geographic distribution),
consist of the Sparganophilidae, Biwadrilidae, Almidae (mostly warm
tropics, including Criodrilus, Mediterranean region, etc.) and
Lutodrilidae (Southern Nearctic). In the likelihood ratio tests the
28S data-set is inconsistent with monophyly of the Aquamegadrili but
this is entirely due to the inclusion of Komarekiona as sister-taxon
of Sparganophilus. The 28S data are inconsistent with partition of
crassiclitellates into Aquamegadrili and Terrimegadrili (see families
listed in Introduction); instead the representatives (Sparganophilus,
Criodrilus and Lutodrilus) of the original aquamegadrile taxa, with
Komarekiona, lie within a paraphyletic Terrimegadrili (Figs 2; 3).
The Lumbricoidea is incompatible with the 28S data but principally
because the glossoscolecid sp. never groups with the remainder in the
best trees but groups equivocally with Eudrilus. However, assessing
the unity of the majority of the remaining Lumbricoidea will
necessarily require more sampling.

Eudrilidae Claus, 1880 

The family Eudrilidae has formerly been associated with the
Megascolecidae and Ocnerodrilidae in a superfamily Megascolecoidea
(Jamieson 1978) or in a separate superfamily Eudriloidea as the
sister-group of the Lumbricoidea + Megascolecoidea s.s., in the
morphocladistic analysis (Jamieson 1988). In the present analysis,
Eudrilus always has the glossoscolecid Gen. sp. (a lumbricoid sensu
Jamieson 1978, 1988) as its sister-taxon, with moderate BS support of
63-84 % (MP, ML, Figs 2; 3). The other glossoscolecid, Pontoscolex,
may or may not link with these. There is no molecular support here
for regarding eudrilids as the unique sister-group of the
Ocnerodrilidae + Megascolecidae assemblage. Omodeo (2000) derived
eudrilids independently (from alluroidids), thus also noting their
distinctness but that origin goes contrary to the present
confirmation of the monophyletic nature of the Crassiclitellata.

Megascolecoidea (Megascolecidae Rosa, 1891 + 
Ocnerodrilidae Beddard, 1891) 

In the morphocladistic analysis (Jamieson 1988), the superfamily
Megascolecoidea contained only the Megascolecidae, with the two
subfamilies Acanthodrilinae and Megascolecinae, and excluded the
Ocnerodrilidae and Eudrilidae. However, all three of these families
had been tentatively included in the superfamily Megascolecoidea in
Jamieson (1978, 1980). Recognition by Lee (1959), Jamieson (1971a-c,
1978, 1980), and Sims (1966, 1967) of an Acanthodrilinae +
Megascolecinae assemblage together with the Ocnerodril(-inae) (-idae)
is endorsed in the present molecular analyses. Monophyly of the
ocnerodrile exemplar Eukerria saltensis with the Megascolecidae
(Acanthodrilinae + Megascolecinae) is supported in all cladistic
analyses that have additional outgroups. This relationship of the
Ocnerodrilidae with the Megascolecidae has > 90% BS support in MP
trees (see Figs 2; 4, also Jamieson 2000). In view of the
sister-group relationship of Eukerria saltensis with the
Megascolecidae s.l., it appears that the family Ocnerodrilidae may be
included in the Megascolecoidea, rather than having a suprafamilial
rank of its own, but this needs wider sampling to test monophyly, and
the relationships of the two ocnerodrile tribes Ocnerodrilini and
Malabarini.

Megascolecidae sensu Gates 1972 
With regard to previous classifications of the Megascolecidae the
following conclusions can be drawn from the present molecular
analyses. The widely used system for internal classification of the
Megascolecidae of Gates (1959, 1972) cannot be sustained, as already
argued by Lee (1959, 1970) and Jamieson (1971a-c). Diagnosis of
Megascolecidae in the restricted sense of Gates (1972) by the
possession of racemose prostates (with holo- or meronephridia) is not
supported in any MP trees, with multiple evolution of racemose and
tubular prostates implied by the molecular phylogenies (Figs 3; 4).
However, bootstrap values are low in the relevant section of the
trees, and no species with tubular prostates has a high bootstrap
linkage with a species with racemose prostates [notwithstanding the
Terrisswalkerius athertonensis-Didymogaster clade]. Likelihood ratio
tests are equivocal on this but we suggest this is more a reflection
on the SH test as overly conservative.

Octochaetidae sensu Gates 1972 and Acanthodrilidae sensu Gates 1972

The Octochaetidae were defined by Gates (1972) and, as the
Octochaetinae, by Gates (1959) and Sims (1967), as all species with
tubular prostates and more than one pair of nephridia per segment
(meronephridia). Gates (1959, 1972) diagnosed the Acanthodril(-inae),
(-idae), as having tubular prostates and a pair of nephridia
(holonephridia) per segment. The relationship of the meronephric
Neodiplotrema with the holonephric Diplotrema in all trees (either
28S, mtDNA or combined) argues against recognition of the
Octochaet(-idae), (-inae), and against the Acanthodril(-inae),
(-idae) of Gates and of Sims. The Neodiplotrema + Diplotrema clade
fully endorses the conclusion by Lee (1959, 1970) that phylogenetic
pairs of holonephric with meronephric species of Acanthodrilinae are
recognizable. We will now further consider subdivision of the
Megascolecidae into the subfamilies Acanthodrilinae and
Megascolecinae and division of the Megascolecinae into the tribes
Perionychini, Dichogastrini, and Megascolecini in the classification
of Jamieson (1971a-c).

Acanthodrilinae sensu Jamieson 1971a 
The definition of the subfamily Acanthodrilinae sensu Jamieson
(1971a) differs fundamentally from that of Gates (1959, 1972).
Prostates are not only tubular but may also (rarely) be racemose and
nephridia are not only holonephridia but may also be meronephridia.
Unlike many acanthodrilids of Gates, the prostates usually do not
discharge on segment 18 (doing so in Rhododrilus), typically opening
on segments 17 and 19, as in the type-genus. Megascolecinae with
homeotic displacement of male pores may correspond with this
definition but show their affinities with megascolecines in other
respects. Posterior nephridia in Acanthodrilinae sensu Jamieson
(1971a) lacked the median funnel diagnostic of the Dichogastrini
within the Megascolecinae sensu Jamieson. The alimentary and vascular
systems differed from those of the Ocnerodrilinae in some of which,
as in Eukerria Michaelsen, 1935, the male and prostate pores have the
acanthodrilin arrangement. It has been shown in the molecular
analysis that dichogastrins with acanthodrilin male pores must be
transferred to the Acanthodrilinae and that the dichogastrin
nephridial condition has arisen more than once (see below).
Megascolecinae

Perionychini Jamieson, 1971. The tribe Perionychini (defined by
megascolecin male pores and holonephridia, irrespective of prostate
type) is represented here by five genera. There is no convincing
support in any analysis for a unified or monophyletic Perionychini
although there is evidence that some of these holonephric
megascolecines are closely related. There is, however, much
instability in perionychin relationships, reflected in low BS values
and lack of consensus (Figs 3-5). In all analyses the "perionychin"
(acanthodrile sensu Gates) Pontodrilus diverges at or near the base
of the Megascolecinae s.l., and while exclusion from the
Acanthodrilinae is upheld, its uncertain placement reflects its
enigmatic affinities on morphological and ecological grounds.
Pontodrilus is highly unusual among crassiclitellates in being
euryhaline. It may be suspected of having had a long independent
evolution. The tribe Perionychini, although a convenient taxonomic
grouping, is thus a para- or polyphyletic assemblage in the molecular
analyses. This has previously been suspected (Jamieson 1988) as the
Perionychini is recognized on the basis of the symplesiomorphic
possession of holonephridia (a condition seen throughout the
Oligochaeta, whereas meronephry is virtually limited to
Megascolecidae). It is therefore a grade rather than a clade.

Dichogastrini Jamieson, 1971 and Acanthodrilinae Vejdovsky, 1884. The
Dichogastrini were defined by presence of a single stomate
meronephridium median to astomate micromeronephridia on each side in
caudal segments, in the absence of posterior enteronephry (Jamieson
1971a). It has, however, repeatedly been questioned (e.g., Jamieson
1978, 1981; Dyne 1984) that dichogastrins with acanthodrilin male
pores (here represented by Dichogaster) are monophyletic with those
with megascolecin pores (e.g., Digaster).

The present analyses relegate "acanthodrilin dichogastrins"
(Dichogaster and Neodiplotrema) to the Acanthodrilinae and
"megascolecin dichogastrins" (albeit represented only by Digaster and
Didymogaster) to the Megascolecinae s.l. in combined and separate
analyses for both mtDNA and 28S with overall good support (e.g., BS
support of 77% in the combined ML tree, Fig. 4).

The Nearctic Diplocardia longiseta appears to lie 
within this “acanthodrilin” clade (83% BS, 
Fig. 3) but the only available data (the 28S) are 
not sufficient for further resolution and more taxa 
are required. Diplocardia Garman, 1888 has been 
shown by James (1990) to be closely similar in 
morphology to Diplotrema. The available 12S 
data support the argument (Jamieson 1995) that 
Rhododrilus glandifera, in the Wet Tropics of 
Queensland, is locally derived from a precursor 
with the acanthodrilin arrangement of male pores 
(probably Diplotrema with which it has a 99% BS 
value in Fig. 3) though this requires confirmation 
from analysis of larger numbers of sequences. 
R. glandifera thus appears to deserve a subgeneric 
rank in Diplotrema or generic rank separately 
from Rhododrilus, the type-locality of which is in 
New Zealand. 

All trees from combined data endorse recognition of the
Acanthodrilinae for worms with acanthodrilin male pores, including
acanthodrilin Dichogastrini (Dichogaster) (though there is lack of
resolution for 28S, Fig. 2) but not those ocnerodriles (Eukerria)
with acanthodrilin or other male terminalia. The ocnerodriles (albeit
represented only by Eukerria) are phylogenetically distinct in the
present study and are well-defined morphologically.

Megascolecini sensu Jamieson 1971. The third tribe of the
Megascolecidae, the Megascolecini, was defined by having male and
prostate pores coincident on segment 18 (rarely segment 17), and
meronephry in which a median stomate nephridium, if present, differed
from those of dichogastrins in opening into the intestine
(enteronephry). Prostates were racemose, tubular or tubuloracemose
(Jamieson 1971a-c). In contrast, Gates (1959, 1972) attributed only
worms with racemose prostates, irrespective of nephridial types, to
his restricted Megascolecidae.

Resolution of the Megascolecini was not an aim of the present work
and, as few representatives have been included (Amynthas Kinberg,
1867, Begemius Easton, 1982, Propheretima Jamieson, 1995, and
Spenceriella Michaelsen, 1907), results must be regarded with
caution. However, none of the analyses supports retention of the
Megascolecini as defined by Jamieson (1971a-c). It is to be expected
that the criteria of meronephry with enteronephry may have evolved
more than once. A core of megascolecin genera, including among others
Begemius and Amynthas, is suspected to be monophyletic however
(Jamieson 1981), though it no longer appears that Spenceriella is as
close to the pheretimoids as previously argued.

Relevance to Clitellata (oligochaete, leech and branchiobdellid) relationships

Morphological and molecular support for paraphyly or polyphyly of the
Oligochaeta and inclusion within this group of leeches and
branchiobdellids has been outlined in the Introduction. Our 28S data
are consistent with a leech-branchiobdellid-lumbriculid clade, within
the Oligochaeta, less so for a leech-branchiobdellid grouping,
considering the variation among methods. Although long branch/rate
acceleration/base content has posed problems for the nuclear
ribosomal genes analyses, consistency with data of different
characteristics (mtDNA and COI) underlines these relationships. The
lumbriculid relationship was proposed on morphological grounds by
Michaelsen (1928-1932), Brinkhurst & Nemec (1986), Brinkhurst &
Gelder (1989), Brinkhurst (1999a) and, less certainly, by Brinkhurst
(1999b) and from molecular data by Siddall & Burreson (1998), Martin
(2001) and Siddall et al. (2001). If accepted, the present phylogeny
would confirm the Oligochaeta as a paraphyletic group as previously
mooted (Jamieson et al. 1987; Jamieson 1988; Martin 2001), being
merely the non-leech, non-branchiobdellidan clitellates (Purschke et
al. 1993). Inclusion of leeches and branchiobdellids within the
Oligochaeta would thus render the name Oligochaeta synonymous with
Clitellata (or Euclitellata of Jamieson 1983), as proposed by Siddall
et al. (2001) and in a study of molecular phylogeny of the
Tubificidae (subsuming the Naididae) by Erseus et al. (2002).